Proceedings of the ICML / UAI / COLT Workshop on Abstraction in Reinforcement Learning

Authors

  • Özgür Şimşek
  • George Konidaris
  • Balaraman Ravindran
  • Alicia P. Wolfe
  • Keith Bush
  • Joelle Pineau
  • Massimo Avoli
  • Lutz Frommberger
  • Sridhar Mahadevan
Abstract

Bayesian Reinforcement Learning (BRL) provides an optimal solution to online learning while acting, but it is computationally intractable for all but the simplest problems: at each decision time, the agent must weigh all possible courses of action against beliefs about future outcomes constructed over long time horizons. To improve tractability, previous research has focused on sparsely sampling the courses of action most relevant to computing value; however, sampling alone does not scale well to larger environments. In this paper, we investigate whether an abstraction called projects (parts of the transition dynamics that bias the lookahead toward promising areas of the environment) can scale up BRL to larger environments. We modify a sparse sampler to incorporate projects. We test our algorithm on standard problems that require an effective exploration-exploitation balance and show that learning can be sped up significantly compared to a simpler BRL method and classic Q-learning.
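To illustrate the kind of lookahead the abstract describes, the following is a minimal Python sketch, not the authors' implementation, of sparse sampling over a Bayesian belief in which an abstraction restricts branching to promising actions. The names belief.sample_transition, belief.update, and promising_actions are assumed interfaces standing in for the posterior predictive model and for the projects abstraction.

def lookahead_value(belief, state, depth, width, gamma, promising_actions):
    """Depth-limited, sparsely sampled estimate of the Bayesian value of state."""
    if depth == 0:
        return 0.0
    best = float("-inf")
    for action in promising_actions(belief, state):    # abstraction biases the branching
        total = 0.0
        for _ in range(width):                          # sparse sampling: a few posterior draws
            nxt, reward = belief.sample_transition(state, action)
            posterior = belief.update(state, action, nxt, reward)
            total += reward + gamma * lookahead_value(
                posterior, nxt, depth - 1, width, gamma, promising_actions)
        best = max(best, total / width)
    return best

def choose_action(belief, state, depth, width, gamma, promising_actions):
    """Greedy action with respect to the sparse Bayesian lookahead."""
    def q(action):
        total = 0.0
        for _ in range(width):
            nxt, reward = belief.sample_transition(state, action)
            posterior = belief.update(state, action, nxt, reward)
            total += reward + gamma * lookahead_value(
                posterior, nxt, depth - 1, width, gamma, promising_actions)
        return total / width
    return max(promising_actions(belief, state), key=q)

Restricting the loop over actions to the output of promising_actions is where an abstraction would cut the branching factor; with promising_actions returning every action, this reduces to plain sparse sampling.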


Similar articles

Multi-objective optimisation of relevance vector machines: selecting sparse features for face verification (ICML/UAI/COLT 2008 Workshop on Sparse Optimisation and Variable Selection)

The relevance vector machine (RVM) (Tipping, 2001) encapsulates a sparse probabilistic model for machine learning tasks. As with support vector machines, the kernel trick allows modelling in high-dimensional feature spaces at low computational cost. However, sparsity is controlled not just by the automatic relevance determination (ARD) prior but also by the choice of basis fu...
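For context on how the automatic relevance determination prior produces sparsity in an RVM, here is a minimal Python sketch of Tipping-style hyperparameter re-estimation in which basis functions whose precision hyperparameter diverges are pruned. The function name, the pruning threshold, and the small stabilising constants are illustrative assumptions, not the workshop paper's method.

import numpy as np

def rvm_ard(Phi, t, n_iter=100, prune_at=1e6):
    """Phi: (N, M) design matrix, t: (N,) targets.
    Returns the posterior mean weights and the indices of the kept basis functions."""
    N, M = Phi.shape
    alpha = np.ones(M)      # per-basis precision hyperparameters (ARD prior)
    beta = 1.0              # noise precision
    keep = np.arange(M)
    for _ in range(n_iter):
        P = Phi[:, keep]
        Sigma = np.linalg.inv(beta * P.T @ P + np.diag(alpha[keep]))
        mu = beta * Sigma @ P.T @ t
        gamma = 1.0 - alpha[keep] * np.diag(Sigma)   # how well-determined each weight is
        alpha[keep] = gamma / (mu ** 2 + 1e-12)      # ARD fixed-point update
        beta = (N - gamma.sum()) / (np.sum((t - P @ mu) ** 2) + 1e-12)
        keep = keep[alpha[keep] < prune_at]          # prune diverged bases: this is the sparsity
    P = Phi[:, keep]
    Sigma = np.linalg.inv(beta * P.T @ P + np.diag(alpha[keep]))
    mu = beta * Sigma @ P.T @ t
    return mu, keep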


Improving medical predictive models via Likelihood Gamble Pricing

A combination of radiotherapy and chemotherapy is often the treatment of choice for cancer patients. Recent developments in treatment have led to improved survival. However, traditionally used clinical variables have poor accuracy for the prediction of survival and radiation treatment side effects. The objective of this work is to develop and validate improved predictive model...


Abstraction Selection in Model-based Reinforcement Learning

Nan Jiang, Alex Kulesza, Satinder Singh. Computer Science & Engineering, University of Michigan.


Discovering Hierarchy in Reinforcement Learning with HEXQ

An open problem in reinforcement learning is discovering hierarchical structure. We describe HEXQ, an algorithm that automatically attempts to decompose and solve a model-free factored MDP hierarchically. By searching for aliased Markov sub-space regions based on the state variables, the algorithm uses temporal and state abstraction to construct a hierarchy of interlinked smaller MDPs.
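To make the decomposition idea concrete, the following is a minimal Python sketch, not Hengst's HEXQ implementation, of the first step the abstract alludes to: ranking state variables by how often they change under random exploration, which fixes the order in which the hierarchy levels are built. The env.reset, env.step, and env.actions interfaces are assumed for illustration.

import random
from collections import Counter

def rank_variables_by_change_frequency(env, n_steps=10000):
    """Return state-variable indices ordered from fastest- to slowest-changing,
    estimated from a random walk; variables that never change are omitted."""
    change_counts = Counter()
    state = env.reset()                      # state is assumed to be a tuple of variables
    for _ in range(n_steps):
        action = random.choice(env.actions)
        next_state, done = env.step(action)
        for i, (old, new) in enumerate(zip(state, next_state)):
            if old != new:
                change_counts[i] += 1        # variable i changed on this step
        state = env.reset() if done else next_state
    # The fastest-changing variable defines the lowest level of the hierarchy.
    return sorted(change_counts, key=change_counts.get, reverse=True)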


Report from Dagstuhl Seminar 13321: Reinforcement Learning

This Dagstuhl Seminar also stood as the 11th European Workshop on Reinforcement Learning (EWRL11). Reinforcement learning gains more and more attention each year, as can be seen at the various conferences (ECML, ICML, IJCAI, ...). EWRL, and in particular this Dagstuhl Seminar, aimed at gathering people interested in reinforcement learning from all around the globe. This unusual format for EW...



Year of publication: 2009